10 research outputs found

    De novo construction of polyploid linkage maps using discrete graphical models

    Linkage maps are used to identify the location of genes responsible for traits and diseases. New sequencing techniques have created opportunities to substantially increase the density of genetic markers. These advances have given rise to new challenges, such as constructing high-density linkage maps. Current multiple testing approaches based on pairwise recombination fractions are underpowered in the high-dimensional setting and do not extend easily to polyploid species. We propose to construct linkage maps using graphical models, via either a sparse Gaussian copula or a nonparanormal skeptic approach. Linkage groups (LGs), typically chromosomes, and the order of markers in each LG are determined by inferring the conditional independence relationships among large numbers of markers in the genome. Through simulations, we illustrate the utility of our map construction method and compare its performance with other available methods, both when the data are clean and complete and when they contain genotyping errors and missing observations. We apply the proposed method to two genotype datasets, barley and potato, from diploid and polyploid populations, respectively. Our map construction method makes full use of dosage SNP data to reconstruct linkage maps for any bi-parental diploid or polyploid species. We have implemented the method in the R package netgwas. Comment: 25 pages, 7 figures
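
The idea that linkage groups and marker order follow from conditional independence relationships can be illustrated with a small sketch. It assumes the graph has already been estimated and is given as a list of edges between markers; the function names `linkage_groups` and `order_chain` and the marker labels are hypothetical, and this simplified logic is not the algorithm implemented in netgwas. Linkage groups are taken to be the connected components of the graph, and markers are ordered only when a component forms a simple chain.

```python
from collections import defaultdict

def linkage_groups(edges, markers):
    """Group markers into connected components of an (assumed) estimated graph."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, groups = set(), []
    for m in markers:
        if m in seen:
            continue
        stack, comp = [m], []
        seen.add(m)
        while stack:  # depth-first search over one component
            node = stack.pop()
            comp.append(node)
            for nb in sorted(adj[node]):
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        groups.append(comp)
    return groups, adj

def order_chain(component, adj):
    """Return the marker order if the component is a simple path, else None."""
    if len(component) == 1:
        return list(component)
    ends = [m for m in component if len(adj[m]) == 1]
    if len(ends) != 2 or any(len(adj[m]) > 2 for m in component):
        return None  # branching or cyclic graph: no unique chain order
    start = min(ends)  # arbitrary but deterministic choice of orientation
    order, prev, cur = [start], None, start
    while True:
        nxt = [n for n in adj[cur] if n != prev]
        if not nxt:
            break
        prev, cur = cur, nxt[0]
        order.append(cur)
    return order

# Hypothetical example: five markers, two chains.
markers = ["m1", "m2", "m3", "m4", "m5"]
edges = [("m3", "m1"), ("m2", "m3"), ("m4", "m5")]
groups, adj = linkage_groups(edges, markers)
for g in groups:
    print(order_chain(g, adj))
```

In real data the estimated graph is rarely a perfect chain, so an actual method has to resolve extra edges within a linkage group; the sketch simply reports `None` in that case.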

    Extensions of graphical models with applications in genetics and genomics

    The living cell is a complex system of interacting molecules, in which genes are transcribed into RNAs and translated into proteins. Most biological characteristics arise from complex interactions between the numerous components of a cell. A key challenge for biology is therefore to understand the structure and dynamics of the complex inter- and intracellular web of interactions that contribute to the structure and function of a living cell. The behaviour of most complex systems, from a cell to the Internet, emerges from the activity of many components that interact pairwise. At an abstract level, these components can be represented by a set of nodes connected by edges, where each edge shows the interaction between two components. The nodes and edges together form a network or, in more formal language, a graph. The aims of this work were to extend graphical models to different data structures and to broaden the applicability of graphical models in diverse fields, in particular in systems genetics. In this thesis we developed a method, based on undirected graphical models, to infer direct relationships between the components of a system. In addition, we extended graphical models to high-dimensional time-series data with a non-Gaussian structure, combining directed and undirected graphical models to investigate dynamic and contemporaneous interactions. We implemented the proposed methods as user-friendly software, named netgwas and tsnetwork, which is freely available to users.

    netgwas: An R Package for Network-Based Genome-Wide Association Studies

    Graphical models are powerful tools for modeling and making statistical inferences regarding complex associations among variables in multivariate data. In this paper we introduce the R package netgwas, which is built on undirected graphical models to accomplish three important and interrelated goals in genetics: constructing linkage maps, reconstructing linkage disequilibrium (LD) networks from multi-loci genotype data, and detecting high-dimensional genotype-phenotype networks. Unlike other software, the netgwas package handles species with any chromosome copy number in a unified way. It implements recent improvements in both linkage map construction (Behrouzi and Wit, 2018) and the reconstruction of conditional independence networks for non-Gaussian continuous data, discrete data, and mixed discrete-and-continuous data (Behrouzi and Wit, 2017). Such datasets, for example genotype data and genotype-phenotype data, routinely occur in genetics and genomics. We demonstrate the value of the package's functionality by applying it to various multivariate example datasets taken from the literature. We show, in particular, that our package allows a more realistic analysis of data, as it adjusts for the effect of all other variables while performing pairwise associations. This feature controls for spurious associations between variables that can arise from classical multiple testing approaches. This paper includes a brief overview of the statistical methods implemented in the package. The main body of the paper explains how to use the package. The package uses a parallelization strategy on multi-core processors to speed up computations for large datasets. In addition, it contains several functions for simulation and visualization. The netgwas package is freely available at https://cran.r-project.org/web/packages/netgwas. Comment: 32 pages, 9 figures; due to the limitation "The abstract field cannot be longer than 1,920 characters", the abstract appearing here is slightly shorter than that in the PDF file

    A Spatial Autoregressive Graphical Model with Applications in Intercropping

    Within the statistical literature, there is a lack of methods that allow for asymmetric multivariate spatial effects when modelling the relations underlying complex spatial phenomena. Intercropping is one such phenomenon. In this ancient agricultural practice, multiple crop species or varieties are cultivated together in close proximity and are subject to mutual competition. To properly analyse such a system, it is necessary to account for both within- and between-plot effects, where between-plot effects are asymmetric. Building on the multivariate spatial autoregressive model and the Gaussian graphical model, the proposed method takes asymmetric spatial relations into account, thereby removing some of the limiting factors of spatial analyses and giving researchers a better indication of the existence and extent of spatial relationships. Using a Bayesian estimation framework, the model shows promising results in the simulation study. The model is applied to intercropping data consisting of Belgian endive and beetroot, illustrating the usage of the proposed methodology. An R package containing the proposed methodology can be found at https://CRAN.R-project.org/package=SAGM

    Reconstruction of Networks with Direct and Indirect Genetic Effects

    Genetic variance of a phenotypic trait can originate from direct genetic effects, or from indirect effects, i.e., genetic effects on other traits that in turn affect the trait of interest. This distinction is often of great importance, for example, when trying to improve crop yield and simultaneously control plant height. As suggested by Sewall Wright, assessing contributions of direct and indirect effects requires knowledge of (1) the presence or absence of direct genetic effects on each trait, and (2) the functional relationships between the traits. Because experimental validation of such relationships is often unfeasible, it is increasingly common to reconstruct them using causal inference methods. However, most current methods require all genetic variance to be explained by a small number of quantitative trait loci (QTL) with fixed effects. Only a few authors have considered the “missing heritability” case, where contributions of many undetectable QTL are modeled with random effects. Usually, these are treated as nuisance terms that need to be eliminated by taking residuals from a multi-trait mixed model (MTM). But fitting such an MTM is challenging, and it is impossible to infer the presence of direct genetic effects. Here, we propose an alternative strategy, where genetic effects are formally included in the graph. This has important advantages: (1) genetic effects can be directly incorporated in causal inference, implemented via our PCgen algorithm, which can analyze many more traits; and (2) we can test the existence of direct genetic effects, and improve the orientation of edges between traits. Finally, we show that reconstruction is much more accurate if individual plant or plot data are used, instead of genotypic means. We have implemented the PCgen algorithm in the R package pcgen.

    Natural variation in salt-induced root growth phases and their contribution to root architecture plasticity

    The root system architecture of a plant changes during salt stress exposure. Different accessions of Arabidopsis thaliana have adopted different strategies in remodelling their root architecture during salt stress. Salt induces a multiphase growth response in roots, consisting of a stop phase, quiescent phase, recovery phase and eventually a new level of homoeostasis. We explored natural variation in the length of, and growth rate during, these phases in both main and lateral roots and found that some accessions lack the quiescent phase. Using mathematical models and a correlation-based network allowed us to correlate dynamic traits to overall root architecture and to discover that both the main root growth rate during homoeostasis and lateral root appearance are the strongest determinants of overall root architecture. In addition, this approach revealed a trade-off between investing in main or lateral root length during salt stress. By studying natural variation in high-resolution temporal root growth using mathematical modelling, we gained new insights into the interactions between dynamic root growth traits and identified key traits that modulate overall root architecture during salt stress.

    Dietary Intakes of Vegetable Protein, Folate, and Vitamins B-6 and B-12 Are Partially Correlated with Physical Functioning of Dutch Older Adults Using Copula Graphical Models

    Background: In nutritional epidemiology, confounding and complex internutrient relations are major challenges. An often-used approach is dietary pattern analysis, such as principal component analysis, to deal with internutrient correlations and to more closely resemble the true way nutrients are consumed. However, despite these improvements, these approaches still require subjective decisions in the preselection of food groups. Moreover, they do not make efficient use of multivariate dietary data, because they detect only marginal associations. We propose the use of copula graphical models (CGMs) to model and make statistical inferences regarding complex associations among variables in multivariate data, where associations between all variables can be learned simultaneously. Objective: We aimed to reconstruct nutritional intake and physical functioning networks in Dutch older adults by applying a CGM. Methods: We addressed this issue by uncovering the pairwise associations between variables while correcting for the effect of the remaining variables. More specifically, we used a CGM to infer the precision matrix, which contains all the conditional independence relations between nodes in the graph. The nonzero elements of the precision matrix indicate the presence of a direct association. We applied this method to reconstruct nutrient-physical functioning networks from the combined data of 4 studies (Nu-Age, ProMuscle, ProMO, and V-Fit, total n = 662, mean ± SD age = 75 ± 7 y). The method was implemented in the R package nutriNetwork, which is freely available at https://cran.r-project.org/web/packages/nutriNetwork. Results: Greater intakes of vegetable protein and vitamin B-6 were partially correlated with higher scores on the total Short Physical Performance Battery (SPPB) and the chair rise test. Greater intakes of vitamin B-12 and folate were partially correlated with higher scores on the chair rise test and the total SPPB, respectively. Conclusions: We determined that vegetable protein, vitamin B-6, folate, and vitamin B-12 intakes are partially correlated with improved functional outcome measurements in Dutch older adults.
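
How a precision matrix encodes direct associations can be shown with a minimal sketch. It assumes a known 3-variable correlation matrix with a chain structure (the values are made up for illustration), inverts it exactly, and converts the result to partial correlations using the standard formula rho_ij = -theta_ij / sqrt(theta_ii * theta_jj). A copula graphical model would instead estimate a sparse precision matrix from data, so the helper names `invert` and `partial_correlations` are hypothetical and not part of nutriNetwork.

```python
from fractions import Fraction
from math import sqrt

def invert(matrix):
    """Invert a small square matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a nonzero pivot and move it onto the diagonal.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_correlations(precision):
    """Partial correlation of i and j given all other variables."""
    n = len(precision)
    return [[1.0 if i == j else
             float(-precision[i][j]) / sqrt(precision[i][i] * precision[j][j])
             for j in range(n)] for i in range(n)]

# Illustrative chain A - B - C: A and C are correlated only through B.
half, quarter = Fraction(1, 2), Fraction(1, 4)
sigma = [[1, half, quarter],
         [half, 1, half],
         [quarter, half, 1]]

theta = invert(sigma)
pcor = partial_correlations(theta)
print(theta[0][2])   # zero entry: no direct A-C association
print(pcor[0][1])    # nonzero: direct A-B association
```

Although A and C have a marginal correlation of 1/4, the corresponding precision entry is exactly zero, which is the sense in which a marginal analysis can report an association that disappears once the remaining variables are accounted for.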
